NSF PAR Search | NSF Public Access Repository

Real-time semantic segmentation on FPGAs for autonomous vehicles with hls4ml

https://doi.org/10.1088/2632-2153/ac9cb5

Ghielmetti, Nicolò; Loncar, Vladimir; Pierini, Maurizio; Roed, Marcel; Summers, Sioni; Aarrestad, Thea; Petersson, Christoffer; Linander, Hampus; Ngadiuba, Jennifer; Lin, Kelvin; et al (November 2022, Machine Learning: Science and Technology)

In this paper, we investigate how field programmable gate arrays can serve as hardware accelerators for real-time semantic segmentation tasks relevant for autonomous driving. Considering compressed versions of the ENet convolutional neural network architecture, we demonstrate a fully-on-chip deployment with a latency of 4.9 ms per image, using less than 30% of the available resources on a Xilinx ZCU102 evaluation board. The latency is reduced to 3 ms per image when increasing the batch size to ten, corresponding to the use case where the autonomous vehicle receives inputs from multiple cameras simultaneously. We show that, through aggressive filter reduction, heterogeneous quantization-aware training, and an optimized implementation of convolutional layers, the power consumption and resource utilization can be significantly reduced while maintaining accuracy on the Cityscapes dataset.
Full Text Available
Fast convolutional neural networks on FPGAs with hls4ml

https://doi.org/10.1088/2632-2153/ac0ea1

Aarrestad, Thea; Loncar, Vladimir; Ghielmetti, Nicolò; Pierini, Maurizio; Summers, Sioni; Ngadiuba, Jennifer; Petersson, Christoffer; Linander, Hampus; Iiyama, Yutaro; Di Guglielmo, Giuseppe; et al (July 2021, Machine Learning: Science and Technology)
Full Text Available
Accelerated Charged Particle Tracking with Graph Neural Networks on FPGAs

Heinz, Aneesh; Razavimaleki, Vasall; Duarte, Javier; DeZoort, Gage; Ojalvo, Isobel; Thais, Savannah; Atkinson, Markus; Neubauer, Mark; Gray, Lindsey; Jindariani, Sergo; et al (November 2020, ArXivorg)
We develop and study FPGA implementations of algorithms for charged particle tracking based on graph neural networks. The two complementary FPGA designs are based on OpenCL, a framework for writing programs that execute across heterogeneous platforms, and hls4ml, a high-level-synthesis-based compiler for neural network to firmware conversion. We evaluate and compare the resource usage, latency, and tracking performance of our implementations based on a benchmark dataset. We find a considerable speedup over CPU-based execution is possible, potentially enabling such algorithms to be used effectively in future computing workflows and the FPGA-based Level-1 trigger at the CERN Large Hadron Collider.
Full Text Available
hls4ml: An Open-Source Codesign Workflow to Empower Scientific Low-Power Machine Learning Devices

Fahim, Farah; Hawks, Benjamin; Herwig, Christian; Hirschauer, James; Jindariani, Serge; Nhan, Trần; Carloni, Luca; DiGuglielmo, Giuseppe; Harris, Phillip; Krupa, Jeffrey; et al (April 2021, ArXivorg)
Accessible machine learning algorithms, software, and diagnostic tools for energy-efficient devices and systems are extremely valuable across a broad range of application domains. In scientific domains, real-time near-sensor processing can drastically improve experimental design and accelerate scientific discoveries. To support domain scientists, we have developed hls4ml, an open-source software-hardware codesign workflow to interpret and translate machine learning algorithms for implementation with both FPGA and ASIC technologies. We expand on previous hls4ml work by extending capabilities and techniques towards low-power implementations and increased usability: new Python APIs, quantization-aware pruning, end-to-end FPGA workflows, long pipeline kernels for low power, and new device backends, including an ASIC workflow. Taken together, these and continued efforts in hls4ml will arm a new generation of domain scientists with accessible, efficient, and powerful tools for machine-learning-accelerated discovery.
Full Text Available
The Dark Machines Anomaly Score Challenge: Benchmark Data and Model Independent Event Classification for the Large Hadron Collider

https://doi.org/10.21468/SciPostPhys.12.1.043

Aarrestad, Thea; van Beekveld, Melissa; Bona, Marcella; Boveia, Antonio; Caron, Sascha; Davies, Joe; de Simone, Andrea; Doglioni, Caterina; Duarte, Javier; Farbin, Amir; et al (January 2022, SciPost Physics)

We describe the outcome of a data challenge conducted as part of the Dark Machines (https://www.darkmachines.org) initiative and the Les Houches 2019 workshop on Physics at TeV colliders. The challenge aims to detect signals of new physics at the Large Hadron Collider (LHC) using unsupervised machine learning algorithms. First, we propose how an anomaly score could be implemented to define model-independent signal regions in LHC searches. We define and describe a large benchmark dataset, consisting of >1 billion simulated LHC events corresponding to 10 fb⁻¹ of proton-proton collisions at a center-of-mass energy of 13 TeV. We then review a wide range of anomaly detection and density estimation algorithms, developed in the context of the data challenge, and we measure their performance in a set of realistic analysis environments. We draw a number of useful conclusions that will aid the development of unsupervised new physics searches during the third run of the LHC, and provide our benchmark dataset for future studies at https://www.phenoMLdata.org. Code to reproduce the analysis is provided at https://github.com/bostdiek/DarkMachines-UnsupervisedChallenge.
Full Text Available
Applications and Techniques for Fast Machine Learning in Science

https://doi.org/10.3389/fdata.2022.787421

Deiana, Allison McCarn; Tran, Nhan; Agar, Joshua; Blott, Michaela; Di Guglielmo, Giuseppe; Duarte, Javier; Harris, Philip; Hauck, Scott; Liu, Mia; Neubauer, Mark S.; et al (April 2022, Frontiers in Big Data)

In this community review report, we discuss applications and techniques for fast machine learning (ML) in science—the concept of integrating powerful ML methods into the real-time experimental data processing loop to accelerate scientific discovery. The material for the report builds on two workshops held by the Fast ML for Science community and covers three main areas: applications for fast ML across a number of scientific domains; techniques for training and implementing performant and resource-efficient ML algorithms; and computing architectures, platforms, and technologies for deploying these algorithms. We also present overlapping challenges across the multiple scientific domains where common solutions can be found. This community report is intended to give plenty of examples and inspiration for scientific discovery through integrated and accelerated ML solutions. This is followed by a high-level overview and organization of technical advances, including an abundance of pointers to source material, which can enable these breakthroughs.
Full Text Available
A new calibration method for charm jet identification validated with proton-proton collision events at √s = 13 TeV

https://doi.org/10.1088/1748-0221/17/03/P03014

Tumasyan, Armen; Adam, Wolfgang; Andrejkovic, Janik Walter; Bergauer, Thomas; Chatterjee, Suman; Dragicevic, Marko; Escalante Del Valle, Alberto; Fruehwirth, Rudolf; Jeitler, Manfred; Krammer, Natascha; et al (March 2022, Journal of Instrumentation)

Many measurements at the LHC require efficient identification of heavy-flavour jets, i.e. jets originating from bottom (b) or charm (c) quarks. An overview of the algorithms used to identify c jets is described and a novel method to calibrate them is presented. This new method adjusts the entire distributions of the outputs obtained when the algorithms are applied to jets of different flavours. It is based on an iterative approach exploiting three distinct control regions that are enriched with either b jets, c jets, or light-flavour and gluon jets. Results are presented in the form of correction factors evaluated using proton-proton collision data with an integrated luminosity of 41.5 fb⁻¹ at √s = 13 TeV, collected by the CMS experiment in 2017. The closure of the method is tested by applying the measured correction factors on simulated data sets and checking the agreement between the adjusted simulation and collision data. Furthermore, a validation is performed by testing the method on pseudodata, which emulate various mismodelling conditions. The calibrated results enable the use of the full distributions of heavy-flavour identification algorithm outputs, e.g. as inputs to machine-learning models. Thus, they are expected to increase the sensitivity of future physics analyses.
Full Text Available